Replace tmux with a native session backend#22

Merged
aksOps merged 8 commits into main from claude/remove-tmux-dependency-f8dw47 on Jun 11, 2026

aksOps commented Jun 10, 2026

Summary

Removes the tmux requirement entirely. Every managed agent now runs under a small detached host process owned by uam itself (internal/session): the host holds the agent's PTY, renders output through an in-process terminal emulator (internal/vterm), and serves peek/reply/attach/kill over a per-session Unix socket. Hosts outlive the uam process, so sessions survive TUI exit, terminal close, and logout — the same lifetime contract the private tmux server provided, with zero external dependencies.

What's in here

Native session backend (tmux removed)

  • Per-session detached host: PTY via creack/pty, 4000-line scrollback through a minimal VT100/xterm emulator, control socket in an owner-only per-UID runtime dir
  • Dispatch/list/peek/reply/stop/resume/kill-all keep their exact contracts; capture output matches capture-pane -p -J semantics (plain text, soft-wrap joined)
  • Native attach client replaces tmux attach: raw-mode bridge, screen replay + resize nudge on attach, multi-client attach, Ctrl+B d to detach
  • Agent exit recorded in-process (replaces the tmux session-closed hook), including the exit code (last_exit_code)
  • No shell anywhere on the dispatch path — agent argv is exec'd directly
  • On-disk store format unchanged (tmux_session JSON key kept), so existing sessions.json files load as-is
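For reviewers who have not used capture-pane -J, soft-wrap joining can be sketched roughly as below. This is an illustration only: the row type, its Wrapped flag, and joinSoftWraps are hypothetical names, not the API of internal/vterm.

```go
package main

import (
	"fmt"
	"strings"
)

// row is a hypothetical rendered screen line. Wrapped marks a line that ended
// because output reached the right margin, not because the agent sent a newline.
type row struct {
	Text    string
	Wrapped bool
}

// joinSoftWraps concatenates each soft-wrapped row with the row after it and
// emits a newline only at hard line breaks.
func joinSoftWraps(rows []row) string {
	var b strings.Builder
	for i, r := range rows {
		b.WriteString(r.Text)
		if !r.Wrapped && i < len(rows)-1 {
			b.WriteByte('\n')
		}
	}
	return b.String()
}

func main() {
	rows := []row{
		{Text: "a long line that hit the ri", Wrapped: true},
		{Text: "ght margin"},
		{Text: "next line"},
	}
	fmt.Println(joinSoftWraps(rows))
}
```

The point of joining is that peek output reads the same regardless of the width the host PTY happened to have when the text was written.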

Quick detach (left arrow)

  • A bare left arrow detaches when nothing has been typed since the last submit/clear (Claude-Code-style hop back to the dashboard); inside a draft it moves the cursor. Opt out with UAM_ATTACH_BACK_DETACH=0

Robustness hardening

  • Runtime dir moved off $XDG_RUNTIME_DIR (logind deletes it on logout, stranding live hosts) to /tmp/uam-<uid>, ownership-verified; this is the same rationale as tmux's /tmp/tmux-<uid>
  • PID liveness verified against kernel start times so a recycled PID is never mistaken for (or signalled as) a session
  • Hosts refresh their runtime files' mtimes so systemd-tmpfiles aging never collects an idle session
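The ownership check on the runtime dir could look roughly like the following. This is a Unix-only sketch under assumptions: verifyOwnedDir and ensureRuntimeDir are hypothetical names, and the group/other permission check is an extra guess at what "owner-only" implies, not something confirmed by the diff.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"syscall"
)

// verifyOwnedDir returns an error unless path is a real directory (not a
// symlink) owned by uid and closed to group and others.
func verifyOwnedDir(path string, uid int) error {
	fi, err := os.Lstat(path) // Lstat so a planted symlink is not followed
	if err != nil {
		return err
	}
	if !fi.IsDir() {
		return fmt.Errorf("%s is not a directory", path)
	}
	st, ok := fi.Sys().(*syscall.Stat_t)
	if !ok {
		return fmt.Errorf("%s: ownership unavailable on this platform", path)
	}
	if int(st.Uid) != uid {
		return fmt.Errorf("%s is owned by uid %d, want %d", path, st.Uid, uid)
	}
	if fi.Mode().Perm()&0o077 != 0 {
		return fmt.Errorf("%s is accessible to group or others", path)
	}
	return nil
}

// ensureRuntimeDir creates parent/uam-<uid> when missing, then verifies it.
func ensureRuntimeDir(parent string) (string, error) {
	uid := os.Getuid()
	dir := filepath.Join(parent, fmt.Sprintf("uam-%d", uid))
	if err := os.Mkdir(dir, 0o700); err != nil && !os.IsExist(err) {
		return "", err
	}
	if err := verifyOwnedDir(dir, uid); err != nil {
		return "", err
	}
	return dir, nil
}

func main() {
	// Use a throwaway parent so the demo does not touch the real /tmp/uam-<uid>.
	parent, err := os.MkdirTemp("", "uam-demo-")
	if err != nil {
		fmt.Println("mkdtemp:", err)
		return
	}
	defer os.RemoveAll(parent)
	dir, err := ensureRuntimeDir(parent)
	fmt.Println(dir, err)
}
```

Verifying after creation (instead of trusting a successful Mkdir) matters because the parent is shared and sticky: another user may have created the path first.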

Exact-session resume

  • claude: dispatch seeds --session-id <uam-id> (capability-probed for older releases), resume targets --resume <id> instead of --continue's "most recent in cwd" heuristic; pre-upgrade records keep the --continue fallback
  • copilot: already exact via --name/--resume=; now records provider_session_id for parity
  • opencode: resumes --session <ses_…> whenever an id is known; codex stays on resume --last (CLI cannot preset ids yet)

Bug fixes found during the audit

  • Find could not match a live session by the full UUID dispatch prints (live list only knows the 8-char suffix); the record's full ID is restored during metadata merge
  • Two unbounded maps (TUI peek throttle, adapter PR-scan throttle) now prune dead sessions
  • store.Open failures in app.New were silently swallowed; now logged

Migration note

Sessions still running inside an old tmux -L uam server are not visible to the native backend — finish or stop them first (tmux -L uam kill-server). Stored records carry over unchanged and remain resumable.

Testing

  • Full suite green with -race; golangci-lint and gosec clean locally
  • Real end-to-end coverage: the session package and main_test.go spawn genuine detached hosts (test binary self-exec), covering dispatch → peek → reply → attach/detach → stop → exit-code persistence → kill-all
  • Manual smoke tests of the built binary for every feature, including the --session-id/--resume argv round-trip
  • Known gap: rendering fidelity against real agent TUIs (claude/codex not installed in the dev environment) — peek output may need cosmetic follow-ups, isolated to internal/vterm

https://claude.ai/code/session_01CSfFSb43dMeARphdRnC1mE


Generated by Claude Code

claude added 6 commits June 10, 2026 14:52
uam no longer requires tmux. Each managed agent now runs under a small
detached host process owned by uam itself (internal/session): the host
holds the agent's PTY, renders output through an in-process terminal
emulator (internal/vterm), and serves peek/reply/attach/kill over a
per-session Unix socket in an owner-only runtime dir. Hosts outlive the
uam process, so sessions still survive TUI exit exactly as they did
under the private tmux server.

Functional parity and beyond:
- dispatch/list/peek/reply/stop/resume/kill-all keep their contracts
  (capture is rendered plain text with soft-wrap joining, like
  capture-pane -p -J; SendLine keeps the multi-line prompt semantics)
- attach is now uam's own raw-mode bridge (Ctrl+B d to detach, Ctrl+B
  Ctrl+B for a literal prefix); screen replay plus a resize nudge makes
  full-screen agent TUIs repaint on attach; multiple clients can attach
- agent exit is recorded in-process (store.MarkSessionClosed) instead of
  via the tmux session-closed hook, and the agent's exit code is now
  persisted (last_exit_code) and exposed on Session
- no shell anywhere on the dispatch path: agent argv is exec'd directly
  (ShellJoin survives only for the interactive-shell alias fallback)
- 4000 lines of peek scrollback (vs tmux's default 2000)
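To make the "no shell on the dispatch path" point concrete, here is a minimal sketch using os/exec. It is not the host's actual launch code (which goes through a PTY); agentCommand is a hypothetical helper.

```go
package main

import (
	"errors"
	"fmt"
	"os/exec"
)

// agentCommand builds a command that runs argv[0] with argv[1:] as literal
// arguments. No shell is involved, so spaces, quotes, and metacharacters in
// an argument reach the agent untouched.
func agentCommand(argv []string) (*exec.Cmd, error) {
	if len(argv) == 0 {
		return nil, errors.New("empty agent argv")
	}
	return exec.Command(argv[0], argv[1:]...), nil
}

func main() {
	// The second element would be dangerous under "sh -c"; here it is just text.
	cmd, err := agentCommand([]string{"echo", "hello; $(whoami)"})
	if err != nil {
		fmt.Println(err)
		return
	}
	out, err := cmd.Output()
	fmt.Printf("%q %v\n", out, err)
}
```

Because the prompt is never re-parsed by a shell, there is nothing to quote and nothing to inject into.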

The adapter layer now drives a Backend interface (TmuxAgent -> Agent),
with a recording fake (adapter/adaptertest) replacing the fake-tmux
shell-script test harness; main_test exercises the real host end to end
by routing __host/__attach through the test binary. The store schema is
unchanged on disk (the record field keeps its tmux_session JSON key) so
existing sessions.json files load as-is.

Bug fixes found along the way:
- Find could not match a live session by the full UUID dispatch prints
  (the live list only knows the 8-char name suffix); the stored record's
  full ID is now restored during metadata merge
- the TUI's lastPeekAt map and the adapter's PR-scan throttle map grew
  without bound across session lifetimes; both are pruned on refresh
- store.Open failures in app.New were silently discarded; now logged
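The map pruning fix amounts to something like the sketch below. pruneDead is a hypothetical helper and the live set is assumed to be keyed by session ID; the real refresh code may be shaped differently.

```go
package main

import (
	"fmt"
	"time"
)

// pruneDead deletes every entry of m whose key is not in live.
// Deleting from a map while ranging over it is safe in Go.
func pruneDead[V any](m map[string]V, live map[string]struct{}) {
	for id := range m {
		if _, ok := live[id]; !ok {
			delete(m, id)
		}
	}
}

func main() {
	lastPeekAt := map[string]time.Time{
		"a1b2c3d4": time.Now(),
		"deadbeef": time.Now(), // session that has since exited
	}
	live := map[string]struct{}{"a1b2c3d4": {}}

	pruneDead(lastPeekAt, live)
	fmt.Println(len(lastPeekAt)) // 1
}
```

Running the prune on every refresh bounds each throttle map by the number of live sessions.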

New deps: creack/pty; charmbracelet/x/term and mattn/go-runewidth
promoted to direct.

https://claude.ai/code/session_01CSfFSb43dMeARphdRnC1mE

Pressing a bare left arrow while attached now detaches back to the
dashboard when the agent's input box is (believed) empty, mirroring the
quick-escape UX of Claude-Code-style session managers. Ctrl+B d keeps
working unconditionally.

uam is a byte bridge and cannot see the agent's real input box, so
"empty" is approximated locally in the attach client's new stdinFilter
state machine: printables/tab and any forwarded escape sequence (which
may recall history or move through a menu) disarm the quick detach,
and Enter / bare Esc / Ctrl+C / Ctrl+U re-arm it. Modified arrows
(e.g. shift-left) and arrows inside a typed draft pass through
untouched. Bare Esc is detected via the end-of-chunk heuristic and
forwarded immediately so agent interrupts never lag.

Opt out with UAM_ATTACH_BACK_DETACH=0 for agents that bind a bare left
arrow themselves.
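The arm/disarm rules above can be sketched as a small state machine. This is a simplified illustration, not the stdinFilter from the diff: quickDetach, feed, and seqEnd are hypothetical names, only CSI sequences are parsed, the initial armed state is an assumption, and treating every other byte as "typed" is a conservative guess.

```go
package main

import "fmt"

const (
	esc   = 0x1b
	ctrlC = 0x03
	ctrlU = 0x15
)

// quickDetach tracks whether the agent's input box is believed to be empty.
type quickDetach struct{ armed bool }

func newQuickDetach() *quickDetach { return &quickDetach{armed: true} }

// seqEnd returns the index just past the escape sequence starting at b[i].
// The caller guarantees that b[i] is ESC and that i+1 < len(b).
func seqEnd(b []byte, i int) int {
	if b[i+1] != '[' {
		return i + 2 // ESC plus one byte, e.g. Alt+key
	}
	for j := i + 2; j < len(b); j++ {
		if b[j] >= 0x40 && b[j] <= 0x7e {
			return j + 1 // CSI final byte
		}
	}
	return len(b) // truncated sequence: treat the rest of the chunk as part of it
}

// feed inspects one chunk read from stdin and returns the bytes to forward to
// the agent plus whether the client should detach.
func (q *quickDetach) feed(chunk []byte) (forward []byte, detach bool) {
	for i := 0; i < len(chunk); {
		c := chunk[i]
		switch {
		case c == esc && i == len(chunk)-1:
			q.armed = true // bare Esc (end-of-chunk heuristic) re-arms
			i++
		case c == esc:
			end := seqEnd(chunk, i)
			if q.armed && string(chunk[i:end]) == "\x1b[D" {
				return chunk[:i], true // bare left arrow on an empty draft
			}
			q.armed = false // any forwarded escape sequence disarms
			i = end
		case c == '\r' || c == '\n' || c == ctrlC || c == ctrlU:
			q.armed = true // submit or clear re-arms
			i++
		default:
			q.armed = false // printable, tab, or anything else counts as typing
			i++
		}
	}
	return chunk, false
}

func main() {
	q := newQuickDetach()
	for _, in := range []string{"hi", "\x1b[D", "\r", "\x1b[D"} {
		fwd, detach := q.feed([]byte(in))
		fmt.Printf("%q -> forward %q, detach=%v\n", in, fwd, detach)
	}
}
```

In the demo, the first left arrow is forwarded because "hi" is in the draft, and the second one detaches because Enter re-armed the filter.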

https://claude.ai/code/session_01CSfFSb43dMeARphdRnC1mE

Three gaps in the native backend's crash/logout story:

- The runtime dir defaulted to $XDG_RUNTIME_DIR/uam, which logind
  deletes when the user's last login session ends — not only on reboot.
  Detached hosts survive logout, so their sockets/state files vanished
  out from under them: uam lost track of running agents and resume
  would spawn duplicates. Default is now a per-UID directory under the
  system temp dir (the same reasoning behind tmux's /tmp/tmux-<uid>),
  verified to be a real directory owned by the current user since the
  parent is sticky and shared.

- State files persist host/child PIDs, so a stale file from a crashed
  host could match a recycled PID: List would report a dead session
  alive, and Kill's orphan escalation could signal an unrelated process
  group. State now records each process's kernel start time
  (/proc/<pid>/stat field 22) and every persisted-PID probe verifies it
  (procAliveWithStart), degrading to the plain signal-0 check where
  /proc is unavailable.

- Living in /tmp exposes the files to systemd-tmpfiles age-based
  cleanup; the host now refreshes its state file and socket mtimes
  every 6h so a long-idle session is never collected.
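The start-time check from the second item can be sketched as follows. startTimeFromStat and sameProcess are hypothetical stand-ins for what procAliveWithStart does; the only facts relied on are the documented /proc/<pid>/stat layout and that the comm field may itself contain spaces or parentheses.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"strconv"
	"strings"
)

// startTimeFromStat extracts field 22 (starttime, in clock ticks since boot)
// from the contents of /proc/<pid>/stat.
func startTimeFromStat(stat string) (uint64, error) {
	// comm (field 2) is parenthesised and may contain spaces or ')', so
	// parsing resumes after the last ')'.
	i := strings.LastIndexByte(stat, ')')
	if i < 0 {
		return 0, errors.New("malformed stat line")
	}
	fields := strings.Fields(stat[i+1:])
	// fields[0] is field 3 (state), so field 22 sits at index 19.
	if len(fields) < 20 {
		return 0, errors.New("stat line too short")
	}
	return strconv.ParseUint(fields[19], 10, 64)
}

// sameProcess reports whether the stat line belongs to the process whose
// start time was recorded in the state file.
func sameProcess(stat string, recorded uint64) bool {
	got, err := startTimeFromStat(stat)
	return err == nil && got == recorded
}

func main() {
	raw, err := os.ReadFile("/proc/self/stat")
	if err != nil {
		fmt.Println("no /proc here; a real probe would fall back to signal 0:", err)
		return
	}
	start, err := startTimeFromStat(string(raw))
	fmt.Println(start, err)
}
```

A recycled PID gets a new start time, so comparing the recorded value against the live one distinguishes "same process" from "same number".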

README documents the new location, the logout rationale, and the
loginctl enable-linger note for KillUserProcesses=yes distros.

https://claude.ai/code/session_01CSfFSb43dMeARphdRnC1mE

Resume previously relied on claude's --continue, whose "most recent
conversation in this cwd" heuristic resumes the wrong conversation when
several uam sessions share a directory.

Dispatch now seeds claude's own session id with the uam UUID via
--session-id, recorded in the store as provider_session_id (validated
against the UUID alphabet on load, since it is replayed as resume
argv). Resume targets that exact conversation with --resume <id>.

Seeding is gated on a per-binary capability probe (claude --help
advertising --session-id) so older claude releases — which reject
unknown flags at startup — keep the bare argv, and records without a
seeded id (pre-upgrade sessions, unsupported claude) keep the
--continue fallback.

Plumbing: adapter.Session/ResumeRequest gain ProviderSessionID, agents
gain an optional ProviderSession hook, and the service persists and
re-hydrates the id across refresh/resume. Codex stays on resume --last
(its CLI cannot preset session ids yet); copilot already resumed
exactly via --resume=<uam-id>.
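The claude argv selection described above reduces to roughly this sketch. The function names are hypothetical and the argv is trimmed to the flags this commit discusses; real dispatch carries more arguments.

```go
package main

import (
	"fmt"
	"strings"
)

// helpAdvertises stands in for the capability probe: it checks whether the
// captured output of "claude --help" mentions the flag.
func helpAdvertises(helpText, flag string) bool {
	return strings.Contains(helpText, flag)
}

// claudeDispatchArgs seeds the session id only when the binary supports it,
// because older releases reject unknown flags at startup.
func claudeDispatchArgs(uamID string, seedSupported bool) []string {
	if seedSupported {
		return []string{"claude", "--session-id", uamID}
	}
	return []string{"claude"}
}

// claudeResumeArgs targets the exact conversation when an id was recorded and
// falls back to --continue for records without one.
func claudeResumeArgs(providerSessionID string) []string {
	if providerSessionID != "" {
		return []string{"claude", "--resume", providerSessionID}
	}
	return []string{"claude", "--continue"}
}

func main() {
	help := "Usage: claude [options]\n  --session-id <uuid>\n  --resume [id]\n"
	id := "123e4567-e89b-12d3-a456-426614174000" // made-up UUID for the demo
	fmt.Println(claudeDispatchArgs(id, helpAdvertises(help, "--session-id")))
	fmt.Println(claudeResumeArgs(id))
	fmt.Println(claudeResumeArgs(""))
}
```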

https://claude.ai/code/session_01CSfFSb43dMeARphdRnC1mE

copilot already resumed exactly (--name <uam-id> at dispatch,
--resume=<uam-id> on resume — the pattern the claude change copied);
it now also records the seeded name as provider_session_id so the
store reflects what resume targets.

opencode supports exact resume natively (--session ses_...) but, like
codex, cannot preset the id at launch. Resume now targets the exact
session whenever a record carries a provider_session_id, falling back
to -c (project's most recent) otherwise.

The persisted-id allow-list widens from the UUID alphabet to the id
alphabets the providers actually use (alnum/underscore/dash, no
leading dash so a value can never parse as a flag).
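The widened allow-list can be expressed in a few lines. validProviderID is a hypothetical name, and rejecting the empty string is an assumption of this sketch.

```go
package main

import "fmt"

// validProviderID accepts ASCII letters, digits, underscore, and dash, and
// rejects a leading dash so the value can never be parsed as a flag when it
// is replayed as resume argv.
func validProviderID(id string) bool {
	if id == "" || id[0] == '-' {
		return false
	}
	for i := 0; i < len(id); i++ {
		c := id[i]
		switch {
		case c >= 'a' && c <= 'z', c >= 'A' && c <= 'Z', c >= '0' && c <= '9', c == '_', c == '-':
			// allowed
		default:
			return false
		}
	}
	return true
}

func main() {
	for _, id := range []string{"ses_abc123", "123e4567-e89b-12d3-a456-426614174000", "--resume", "a b"} {
		fmt.Printf("%q valid=%v\n", id, validProviderID(id))
	}
}
```

Validating on load, not only on write, covers the case where sessions.json was edited or corrupted between runs.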

https://claude.ai/code/session_01CSfFSb43dMeARphdRnC1mE

The ? overlay now lists the in-session keys (left-arrow quick detach
and the Ctrl+B d chord) so the attach UX is discoverable without the
README.

https://claude.ai/code/session_01CSfFSb43dMeARphdRnC1mE
socket-security Bot commented Jun 10, 2026

Review the following changes in direct dependencies.

| Diff  | Package                       | Supply Chain Security | Vulnerability | Quality | Maintenance | License |
|-------|-------------------------------|-----------------------|---------------|---------|-------------|---------|
| Added | github.com/creack/pty@v1.1.24 | 97                    | 100           | 100     | 100         | 100     |

claude added 2 commits June 11, 2026 01:19
govulncheck flagged two Go 1.24.7 standard-library vulnerabilities
reachable from the session backend (GO-2026-4971 net, GO-2026-4602 os),
both fixed only in Go 1.25.x: pin toolchain go1.25.11 (the language
version stays go 1.24). golangci-lint in CI moves v2.1.6 -> v2.5.0 to
match a Go 1.25 toolchain (verified clean locally on the same version).

The SonarCloud quality gate failed on new-code coverage (70.3% < 80%),
largely a measurement artifact: session hosts run as child processes in
tests, invisible to the Go coverage profiler. Drive a full host
lifecycle in-process (control ops, attach replay/stdin/resize frames,
unknown-op arm, kill, agent-exit cleanup) so the host runtime is
profiled, and add targeted tests for the remaining cold branches:
vterm escapes (save/restore cursor, reverse index, scroll up/down,
reset, ICH/ECH/EL1/ED1, legacy ?47 alt screen, DCS/OSC-ST, malformed
CSI recovery), Kill's no-socket escalation paths (wedged host, orphaned
agent group), RunHost/RunAttach argument errors, and the CLI's internal
subcommand routing. The sonar test step now measures cross-package
coverage (-coverpkg=./...) so e2e exercise of internal packages counts.

https://claude.ai/code/session_01CSfFSb43dMeARphdRnC1mE

setup-go installs the version from go.mod's go directive, so with
go 1.24.0 + toolchain go1.25.11 the CI golangci-lint was compiled with
Go 1.24 and refused to target the 1.25.11 toolchain ("Go language
version used to build golangci-lint is lower than the targeted Go
version"). Aligning the language directive with the 1.25 toolchain
makes setup-go install Go 1.25.x everywhere. The version check is
language-level, so a golangci-lint built with any Go 1.25.x passes
against the go1.25.11 toolchain (verified locally with golangci-lint
built on go1.25.1).

https://claude.ai/code/session_01CSfFSb43dMeARphdRnC1mE
aksOps merged commit 0919ebe into main on Jun 11, 2026 (13 checks passed)
aksOps deleted the claude/remove-tmux-dependency-f8dw47 branch on Jun 11, 2026 02:33